METHOD AND COMPUTER SYSTEM FOR QUERY OPTIMIZATION

Field of the Invention

The present invention generally relates to electronic
data processing, and more particularly relates to
methods, computer program products and systems for data
retrieval.

Background of the Invention 

Some applications provide query interfaces that allow a
user to define Boolean expressions that include
multiple conditions for selecting data from a computer
system, such as a database system or a file system. For
example, the Boolean expressions can be written in a
standard query language, such as SQL. They are then sent
to the computer system, where they are executed. Examples
of such a computer system are described in the patent
applications WO 02/061612 and WO 02/061613. A Boolean
expression can include multiple conditions which are
combined by Boolean operators. A condition typically
includes an attribute name, an operator, and a value or
a value range.

For example, the following three conditions
referring to the attribute birth_date may be combined
with AND or OR operators:

birth_date between 1938 and 1990
birth_date <= 1940
birth_date > 1920

Advantageously, within the system the complete
Boolean expression is processed in disjunctive or
conjunctive normal form. In this example, it is assumed
that all conditions are combined with a Boolean AND or
that all conditions are combined with a Boolean OR. If
there are many conditions in the Boolean expression, it
may happen intentionally or unintentionally that some
of the conditions refer to the same attribute.

Some combinations of conditions referring to the
same attribute may either lead to uninteresting results
for logical reasons or cause more conditions than
necessary to be passed to the computer system. The
following cases may be distinguished:

Null set: The result set is empty (zero records
found). This may happen if all the conditions are
combined with AND.

Entire search domain: The result set consists of
all records loaded into or stored in the computer
system, so the selectivity is zero. This may happen if
the conditions are combined with OR.

More conditions than necessary: Two or more of the
conditions referring to the same attribute have an
overlap with respect to their selectivity. That is,
there is some redundancy in the conditions of the
Boolean expression.

Summary of the Invention 

The present invention provides methods, computer
program products, and computer systems as described by
the independent claims to simplify a Boolean expression
by summarizing several of its conditions referring to
the same attribute into fewer (one or more) conditions.

Providing a simplified or reduced Boolean expression in
a query to a computer system can result in a more
efficient selection process, since there is a reduced
number of conditions to be evaluated and therefore a
reduced number of potentially large interim result sets
to be combined.

Further, a pre-evaluation of the Boolean
expression can show that the combination of the
conditions evaluates to either zero records or all
records for logical reasons. In this case, there is no
value at all in sending the query to the computer
system. That is, the workload of the computer system
can be kept lower when using a pre-selection mechanism
for Boolean expressions according to the present
invention.

In one embodiment of the present invention, a
graphical user interface (GUI) can be used to perform
the pre-evaluation and inform a user.

The aspects of the invention will be realized and
attained by means of the elements and combinations
particularly pointed out in the appended claims. Also,
the described combination of the features of the
invention is not to be understood as a limitation, and all
the features can be combined in other constellations
without departing from the spirit of the invention. It
is to be understood that both the foregoing general
description and the following detailed description are
exemplary and explanatory only and are not restrictive
of the invention as described.



Brief Description of the Drawings

FIG. 1 is a simplified block diagram of a computer
system that can be used with an embodiment of
the invention;
FIG. 2 shows an example of a graphical user interface
that can be provided by the computer system to a
user;
FIG. 3 is a simplified flowchart of a method that can
be performed by an embodiment of the invention;
FIG. 4 illustrates a first example of applying the
method referring to a numerical attribute;
FIG. 5 illustrates a second example of applying the
method referring to a numerical attribute; and
FIG. 6 illustrates a third example of applying the
method referring to an alphanumeric attribute.

Detailed Description of the Invention

The same reference numbers are used throughout the
drawings to refer to the same or like parts.

Definitions of terms, as used hereinafter: 

Boolean operators:
operators used in Boolean statements, e.g., AND, OR.

Relational operators:
operators used in relational statements, e.g.,

<    (less than)
<=   (less than or equal to)
>    (greater than)
>=   (greater than or equal to)
=    (equal to)
<>   (not equal to)
[..] (between)
]..[ (not between)

Relational operators may be combined with a
Boolean NOT. Such a combined expression can be
translated into a pure relational operator. For
example, "NOT >" corresponds to "<=".
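The translation of a NOT-prefixed relational operator into a pure relational operator can be sketched as a simple lookup. The table name NEGATED and the function negate are hypothetical and chosen only for illustration; the operator spellings follow the list above.

```python
# Hypothetical lookup: relational operator -> operator equivalent to "NOT <operator>".
NEGATED = {
    "<": ">=",
    "<=": ">",
    ">": "<=",
    ">=": "<",
    "=": "<>",
    "<>": "=",
    "[..]": "]..[",  # NOT between     -> not between
    "]..[": "[..]",  # NOT not between -> between
}


def negate(operator: str) -> str:
    """Return the pure relational operator equivalent to NOT <operator>."""
    return NEGATED[operator]


print(negate(">"))  # prints "<="
```

Applying the negation twice returns the original operator, which is a convenient sanity check for such a table.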

Condition:
relational statement comparing data, such as numerical
data or alphanumeric data, using one or more relational
operators.

Boolean expression:
statement including multiple conditions that are
combined using Boolean operators.

FIG. 1 is a simplified block diagram of a computer
system 990 that can be used with an embodiment of the
invention. The computer system 990 includes multiple
computing devices (e.g., first computing device 901 and
second computing device 902) that communicate over a
network 999, such as a local area network (LAN), wide
area network (WAN), the Internet, or a wireless
network.

For example, the second computing device 902 may
be a backend system, such as a database system, a file
system or an application system, that stores data 300.
The data 300 can also be stored anywhere inside or
outside of the computer system 990.

The first computing device 901 may be used to
compose Boolean expressions 310 to be used in a QUERY
for retrieving selected data from the second computing
device 902. For example, the first computing device 901
may be a front end computer that provides a graphical
user interface (GUI) to a user.

In one embodiment of the invention, the first
computing device 901 can run a computer program product
loaded into a memory of the first computing device to
perform a method 400 for logically evaluating the
Boolean expression 310 and create a reduced Boolean
expression 320 that is used in the query statement
instead. The computer program product includes various
portions of computer program instructions that cause at
least one processor of the first computing device 901
to execute corresponding steps and functions of the
method 400.

In one aspect of the invention, algorithms and
methods described in the following description are used
to reduce the number of conditions in the Boolean
expression 310 by eliminating logical redundancies.

In another aspect of the invention, the first
computing device 901 can avoid sending queries to the
second computing device 902 if it is clear in advance
from a logical point of view that, independently of the
given data, the query result set will either be empty
or comprise all the records available in the second
computing device 902, so that there is no selectivity.
A corresponding notification may be sent to the user.
The user can be a human user but can also be a further
computing device communicating with the first computing
device 901 over the network 999.

FIG. 2 shows an example of a GUI that can be provided
by the computer system 990 to a human user. The GUI can
be presented to the user on an output device 950, such
as a monitor. In the example, the output device 950 is
connected to the first computing device 901 over the
network 999. However, the output device 950 can also be
an integral part of any computing device in the
computer system 990, in which case the communication
between the computing device and the output device
would advantageously be handled through an internal
bus. In the following examples, it is assumed that the
first computing device implements the GUI. However, the
GUI may also be implemented by any other computing
device in the computer system 990. In this case the
functionality described hereinafter as being performed
by the first computing device 901 would be performed by
the other computing device.
The GUI can provide to the user specific user
interface components (e.g., ATTRIBUTES column,
OPERATION column, VALUE1 column, VALUE2 column, AND/OR
radio buttons) that allow the user to compose Boolean
expressions. For example, the user can enter selection
attributes (referred to hereinafter as attributes),
such as COUNTRY, BIRTH DATE, or LAST NAME in the
ATTRIBUTES column. Relationship operators can be
entered in the OPERATION column. VALUE1 and VALUE2 can
be used to enter condition values with respect to the
attribute and the relationship operator in the
corresponding row. Advantageously, the user can select
some or all entries to be made from corresponding
reservoirs that can be visualized, for example, in a
drop-down list box or a value help pop-up. For example,
the GUI can support the user to compose the Boolean
expression 310 in disjunctive or conjunctive normal
form by using the AND/OR radio buttons accordingly.

In the example, three conditions referring to the
attribute BIRTH DATE have been entered:

BIRTH DATE [..] 1938 and 1990
BIRTH DATE <= 1940
BIRTH DATE > 1920

Assuming that they are combined with a Boolean AND
(e.g., by selecting the corresponding radio button),
from a logical point of view they can be replaced by a
single condition: BIRTH DATE [..] 1938 and 1940. This
condition becomes part of the reduced Boolean
expression 320. The reduced Boolean expression 320 may
either be displayed on the UI or applied behind the
scenes, so that the UI does not display the change but
simply sends the optimized QUERY including the reduced
Boolean expression 320 to the second computing device
902.
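The outcome of this reduction can be cross-checked with a plain interval intersection. Note that this is not the relationship vector method described below, only a minimal sketch of the expected result; the tuple layout (low, low_inclusive, high, high_inclusive) and the function name intersect are assumptions made for illustration.

```python
import math


def intersect(ranges):
    """Intersect ranges given as (low, low_inclusive, high, high_inclusive).

    Returns the resulting range in the same layout, or None if it is empty.
    """
    low, low_inc, high, high_inc = -math.inf, False, math.inf, False
    for lo, lo_inc, hi, hi_inc in ranges:
        # Keep the tighter lower bound (an exclusive bound is tighter on a tie).
        if lo > low or (lo == low and not lo_inc):
            low, low_inc = lo, lo_inc
        # Keep the tighter upper bound.
        if hi < high or (hi == high and not hi_inc):
            high, high_inc = hi, hi_inc
    empty = low > high or (low == high and not (low_inc and high_inc))
    return None if empty else (low, low_inc, high, high_inc)


birth_date_conditions = [
    (1938, True, 1990, True),        # BIRTH DATE [..] 1938 and 1990
    (-math.inf, False, 1940, True),  # BIRTH DATE <= 1940
    (1920, False, math.inf, False),  # BIRTH DATE > 1920
]
print(intersect(birth_date_conditions))  # prints (1938, True, 1940, True)
```

The result corresponds to the single condition BIRTH DATE [..] 1938 and 1940; a return value of None would correspond to the null set case.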

FIG. 3 is a simplified flowchart of the method 400 for
logically evaluating a Boolean expression 310 used in a
query statement that can be performed by an embodiment
of the invention. For example, the method 400 can be
executed by the first computing device 901.

The Boolean expression 310 refers to an attribute
and includes a plurality of conditions Ci, i=1..m (all
indices used in the description are elements of the
natural numbers). The method 400 comprises the
following steps.

In the receiving step 410, the computer system 990
(cf. FIG. 1) receives the Boolean expression 310. For
example, as described in FIG. 2, a user enters the
Boolean expression in a corresponding GUI.
Alternatively, the Boolean expression can be received
by any computing device in the computer system 990 from
another computing device.
In the decomposing step 420, the Boolean
expression is decomposed into the plurality of
conditions Ci, i=1..m. The computer system 990 can
achieve that by using a parser that is able to
recognize Boolean and relationship operators.

For each condition Ci of the plurality the
following steps are performed:

Extracting 421 at least one condition value
referring to the attribute from the condition Ci. The
at least one condition value defines a value range of
the condition. In the example of FIG. 2, the first
condition includes the two condition values 1938 and
1990 to be retrieved. In this case, one condition value
represents the minimum condition value and the other
condition value represents the maximum condition value
of the value range covered by the first condition. The
second condition includes only one condition value
1940, defining the maximum condition value of the value
range of the second condition. Depending on the
relationship operator used in the condition, the
maximum and/or minimum condition values are included in
or excluded from the value range of the condition. In
case the equal to (=) operator is used, the value range
corresponds to a single value, namely the corresponding
identity condition value.
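As a minimal sketch of the extracting step, the value range of a condition can be derived from its relationship operator and its one or two condition values. The function name value_ranges and the tuple layout (low, low_inclusive, high, high_inclusive) are hypothetical; the sketch returns a list because "<>" and "not between" each describe two separate ranges.

```python
import math
from typing import List, Optional, Tuple

# Assumed representation: (low, low_inclusive, high, high_inclusive)
Range = Tuple[float, bool, float, bool]


def value_ranges(operator: str, v1: float, v2: Optional[float] = None) -> List[Range]:
    """Return the value range(s) covered by a condition on one attribute."""
    neg, pos = -math.inf, math.inf
    if operator == "<":
        return [(neg, False, v1, False)]
    if operator == "<=":
        return [(neg, False, v1, True)]
    if operator == ">":
        return [(v1, False, pos, False)]
    if operator == ">=":
        return [(v1, True, pos, False)]
    if operator == "=":
        return [(v1, True, v1, True)]  # a single value
    if operator == "<>":
        return [(neg, False, v1, False), (v1, False, pos, False)]
    if operator == "[..]":  # between: both condition values included
        return [(v1, True, v2, True)]
    if operator == "]..[":  # not between: both condition values excluded
        return [(neg, False, v1, False), (v2, False, pos, False)]
    raise ValueError(f"unknown operator: {operator}")


print(value_ranges("[..]", 1938, 1990))  # prints [(1938, True, 1990, True)]
```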
Inserting 422 the at least one condition value in
a condition value list in sorted order. The condition
value list can be implemented as a data structure, such
as for example a list structure or table structure.
Advantageously, a linear, bi-directional pointer list
of objects is used. In this case, two pointers are used
between two list elements, pointing in opposite
directions. This allows the list to be traversed in
both directions. Each object represents a condition
value. The insertion in sorted order can be achieved by
either inserting the condition value immediately at its
final location or inserting it anywhere, e.g., at the
end of the data structure, and applying an appropriate
sort algorithm.
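The first variant, inserting the condition value immediately at its final location, might look as follows. This sketch substitutes a plain Python list and the standard bisect module for the bi-directional pointer list; the function name insert_sorted and the choice to reuse an already present value instead of inserting a duplicate are assumptions not taken from the description.

```python
import bisect


def insert_sorted(values: list, value) -> int:
    """Insert value into the sorted list (unless present) and return its index."""
    index = bisect.bisect_left(values, value)  # final location of the value
    if index == len(values) or values[index] != value:
        values.insert(index, value)
    return index


condition_values = []
for v in (5, 30, 25, 70):
    insert_sorted(condition_values, v)
print(condition_values)  # prints [5, 25, 30, 70]
```

A Python list can be traversed in both directions by index, which is what the pointer list is used for in the description.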

Initializing 423 a relationship vector of the at
least one condition value. In one implementation the
relationship vector includes a LESS THAN component, an
EQUAL TO component, and a GREATER THAN component.
Advantageously, each of these three components is
realized as a counter. In this implementation, each
relationship vector component of the at least one
condition value is set to an initial value if the
condition value list has no further condition value.
Advantageously, the initial value is zero but other
values can be used instead. In other words, the
relationship vector components of the first condition
value that is added to an empty condition value list
are all set to the same initial value.

If the condition value list already includes other
condition values, two alternative mechanisms may be
used for initializing. In a first alternative, each
relationship vector component of the inserted condition
value is set to the LESS THAN component value of the
relationship vector of the next greater condition value
in the condition value list. In a second alternative,
each relationship vector component of the inserted
condition value is set to the GREATER THAN component
value of the relationship vector of the next smaller
condition value in the condition value list. Only one of
the two alternatives applies if the condition value is
inserted at the beginning (first alternative) or at the
end (second alternative) of the condition value list.
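The initialization rule can be sketched as a small function. The data layout (a sorted list of [value, [less, equal, greater]] entries that does not yet contain the new value) and the name initial_vector are hypothetical; where both alternatives apply, the sketch arbitrarily uses the first one.

```python
def initial_vector(entries, index, initial=0):
    """Return the initial relationship vector for a value inserted at `index`.

    entries: sorted list of [value, [less, equal, greater]] without the new value.
    """
    if not entries:
        return [initial] * 3          # first value in an empty list
    if index < len(entries):
        # First alternative: LESS THAN component of the next greater value.
        less = entries[index][1][0]
        return [less] * 3
    # Second alternative: GREATER THAN component of the next smaller value.
    greater = entries[index - 1][1][2]
    return [greater] * 3


# State of the condition value list after "n > 5" and "n <= 30" (cf. FIG. 4).
state = [[5, [1, 1, 2]], [30, [2, 2, 1]]]
print(initial_vector(state, 1))  # value 25 goes between 5 and 30: [2, 2, 2]
print(initial_vector(state, 2))  # value 70 goes at the end: [1, 1, 1]
```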

Adjusting 424 the relationship vectors for the at
least one condition value of the condition Ci and each
further condition value that is in the condition value
list and that is in the value range of the condition Ci.
Once the condition value is inserted, at least one
relationship vector component for the at least one
condition value is incremented by an increment to
reflect the relationship operator of the condition. For
example, the increment is 1 but any other increment can
be used instead. For example, if the relationship
operator of the condition is "=", then the EQUAL TO
component is incremented. If the relationship operator
of the condition is ">=", then the EQUAL TO component
and the GREATER THAN component are incremented. If the
relationship operator of the condition is "<", then the
LESS THAN component is incremented.

After the relationship vector for the inserted
condition value has been adjusted according to the
condition that is currently being processed, the
increment is propagated through each relationship
vector component of each further condition value in the
condition value list as long as the further condition
value is within the value range of the currently
processed condition. In other words, each relationship
vector component of each further condition value in the
condition value list is incremented if the further
condition value belongs to a value range affected by
the currently processed condition.
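The inserting, initializing, adjusting, and propagating steps can be put together in a short sketch. The class name ConditionValueList, the OWN table, and the use of two parallel Python lists instead of a pointer list are assumptions made for illustration, as is the handling of "between" as ">=" combined with "<=" and the reuse of an already present condition value. It is one possible reading of the steps, not a definitive implementation.

```python
import bisect

LT, EQ, GT = 0, 1, 2  # positions of the LESS THAN, EQUAL TO, GREATER THAN counters

# Assumed table: counters incremented on the condition's own value, per operator.
OWN = {"<": (LT,), "<=": (LT, EQ), ">": (GT,), ">=": (EQ, GT),
       "=": (EQ,), "<>": (LT, GT)}


class ConditionValueList:
    def __init__(self, increment=1, initial=0):
        self.values = []    # sorted condition values
        self.vectors = []   # parallel [less, equal, greater] counters
        self.increment, self.initial = increment, initial

    def _insert(self, value):
        i = bisect.bisect_left(self.values, value)
        if i < len(self.values) and self.values[i] == value:
            return                               # assumed: reuse existing element
        if not self.values:
            start = self.initial
        elif i < len(self.values):
            start = self.vectors[i][LT]          # first alternative
        else:
            start = self.vectors[i - 1][GT]      # second alternative
        self.values.insert(i, value)
        self.vectors.insert(i, [start] * 3)

    def add(self, op, v1, v2=None):
        """Process one condition: insert, initialize, adjust, propagate."""
        if op == "[..]":                         # between v1 and v2
            own = {v1: (EQ, GT), v2: (LT, EQ)}
            in_range = lambda w: v1 < w < v2
        elif op == "]..[":                       # not between v1 and v2
            own = {v1: (LT,), v2: (GT,)}
            in_range = lambda w: w < v1 or w > v2
        else:
            own = {v1: OWN[op]}
            in_range = {"<": lambda w: w < v1, "<=": lambda w: w < v1,
                        ">": lambda w: w > v1, ">=": lambda w: w > v1,
                        "=": lambda w: False, "<>": lambda w: w != v1}[op]
        for value in own:                        # all insertions before adjusting
            self._insert(value)
        for i, w in enumerate(self.values):
            if w in own:                         # adjust the condition's own value
                for c in own[w]:
                    self.vectors[i][c] += self.increment
            elif in_range(w):                    # propagate to further values
                for c in (LT, EQ, GT):
                    self.vectors[i][c] += self.increment


cvl = ConditionValueList()
cvl.add(">", 5)
cvl.add("<=", 30)
cvl.add("[..]", 25, 70)
print(list(zip(cvl.values, cvl.vectors)))
# prints [(5, [1, 1, 2]), (25, [2, 3, 3]), (30, [3, 3, 2]), (70, [2, 2, 1])]
```

The usage lines feed in the three conditions of the first example (FIG. 4) and arrive at the same list elements 5/(1,1,2), 25/(2,3,3), 30/(3,3,2), and 70/(2,2,1).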

Once all conditions of the Boolean expression 310
referring to the attribute have been processed, in some
cases the Boolean expression 310 can be reduced 430 to
the reduced Boolean expression 320 according to each
relationship vector. Depending on the Boolean operator
that is used for combining the conditions in the
Boolean expression 310, various subsets of condition
values can be identified.

If the AND-operator is used, an AND-subset 201
(cf. FIGS. 4 and 5) of condition values may be
identified in the condition value list. Each of the
subset condition values has at least one relationship
vector component that has a value equal to the increment
multiplied by the number of conditions in the plurality
of conditions of the Boolean expression 310. The
reduced Boolean expression 320 can be composed by
combining the AND-subset condition values with Boolean
operators and with relationship operators according to
their relationship vectors. If the AND-subset 201 is
empty, it indicates that no data records can fulfill
the Boolean expression 310. In this case, a
corresponding notification (NOTIFICATION, cf. FIG. 1)
can be sent to the user before or instead of sending
the query.
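Identifying the AND-subset amounts to a filter over the list elements. In the sketch below the function names and_subset and bound_operator are hypothetical, and the mapping from "full" counters to a relationship operator is an illustrative interpretation of "according to their relationship vectors", not wording taken from the description.

```python
def and_subset(elements, n_conditions, increment=1):
    """Keep elements with at least one counter equal to increment * n_conditions.

    elements: list of (condition_value, (less, equal, greater)) tuples.
    """
    target = increment * n_conditions
    return [(value, vec) for value, vec in elements if target in vec]


def bound_operator(vec, target):
    """Assumed mapping: which counters reach the target -> bounding operator."""
    full = tuple(c == target for c in vec)
    return {
        (True, True, False): "<=",
        (False, True, True): ">=",
        (False, True, False): "=",
        (True, False, False): "<",
        (False, False, True): ">",
        (True, False, True): "<>",
    }.get(full)  # None: the value is an interior point or not in the subset


# Condition value list for "n > 5", "n <= 30", "n between 25 and 70" (cf. FIG. 4).
elements = [(5, (1, 1, 2)), (25, (2, 3, 3)), (30, (3, 3, 2)), (70, (2, 2, 1))]
subset = and_subset(elements, n_conditions=3)
print([(bound_operator(vec, 3), value) for value, vec in subset])
# prints [('>=', 25), ('<=', 30)]
```

An empty return value of and_subset corresponds to the case in which no data record can fulfill the Boolean expression, so that a notification can replace the query.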

If the OR-operator is used, an OR-subset 202 (cf.
FIG. 6) of condition values may be identified in the
condition value list. In one implementation, the
OR-subset 202 of condition values in the condition value
list includes at least one subset condition value that
has at least one relationship vector component with the
initial value. This OR-subset includes the condition
values of the condition value list having a value range
where at least one value does not fulfill the Boolean
expression 320. The reduced Boolean expression 320 can
be composed by combining the OR-subset condition values
with Boolean operators and with relationship operators
according to their relationship vectors. In this
implementation the reduced Boolean expression 320 can
be composed by excluding the non-fulfilling values from
the results of the reduced Boolean expression. For
example, the OR-subset includes a value "5" and the
EQUAL TO component has the initial value. Then the
reduced Boolean expression can be "<>5".

In another implementation the OR-subset of
condition values in the condition value list includes at
least one subset condition value that has at least one
relationship vector component with a value greater than
the initial value. In this implementation the OR-subset
includes the condition values of the condition value
list having a value range where at least one value
fulfills the Boolean expression 320. In this
implementation the reduced Boolean expression can be
composed by including all fulfilling values in the
reduced Boolean expression.

If the OR-subset 202 indicates no selectivity of
the Boolean expression at all, a corresponding
notification (NOTIFICATION, cf. FIG. 1) can be sent to
the user before or instead of sending the query. In the
case of a human user this can be a visual notification
(e.g., a pop-up) or a sound (e.g., a message or beep).
In case of a non-human user (e.g., computing device)
the notification may be implemented by raising an
event, an error message, or a warning in a format that
can be processed by the non-human user.

Therefore, the logical evaluation of the Boolean
expression 310 by the first computing device 901 can
finally lower the workload of the second computing
device 902 by preventing the sending of queries that do
not deliver useful results.

Those skilled in the art can apply the method also
to multidimensional attributes, such as vectors or
matrices. For such attributes the method 400 is
performed for each attribute component in the described
way.

FIG. 4 illustrates a first example of applying the 
method 400 referring to a numerical attribute. 
In this first example, the computer system 990 has
received 410 a Boolean expression 310 including the
following conditions referring to an attribute n:

C1: n > 5
C2: n <= 30
C3: n between 25 and 70

All three conditions are combined with the Boolean
AND operator.

For example, the first computing device 901 runs a
logical evaluation of the Boolean expression 310 to
determine whether the three conditions can be
summarized by fewer conditions and whether it is
logically the case that either zero data records or all
data records stored in the second computing device will
match the set of conditions.

List elements of the condition value list are
illustrated as boxes having four portions. The upper
portion includes the condition value. The lower three
portions represent three components of the relationship
vector for the condition value. The lower left portion
includes the LESS THAN component ("<" counter), the
lower middle portion includes the EQUAL TO component
("=" counter), and the lower right portion includes the
GREATER THAN component (">" counter). The notation

condition value/("<" counter, "=" counter, ">" counter)

for elements of the condition value list is used in all
of the following examples. For each condition, two rows
are shown in the condition value list. The upper row
shows the condition values with their relationship
vectors after extracting 421, inserting 422, and
initializing 423 with respect to the current condition.
The lower row illustrates the status after adjusting
424 (cf. FIG. 3).

In the set of conditions, the four condition
values "5", "30", "25", and "70" occur. Therefore,
after all the condition values have been inserted, the
condition value list contains four list elements.

First, condition C1 is processed. The condition
value "5" is inserted and the relationship vector is
initialized to the initial value (0,0,0) because no
other condition value is included in the condition value
list at that time. The relationship operator of C1 is
">". Therefore, the ">" counter is incremented. The
increment used in the example is 1. The list element
5/(0,0,1) now reflects the condition "> 5" with
reference to the attribute n of the condition C1.

Then condition C2 is processed. The condition
value "30" is inserted and its relationship vector is
initialized (INIT) with the GREATER THAN component
value "1" of the relationship vector of the next
smaller condition value "5". Now the whole condition
value list represents the condition "> 5". The
initialized relationship vector (1,1,1) is then
adjusted by incrementing the "<" counter and the "="
counter. The list element 30/(2,2,1) now reflects the
condition "<= 30" with reference to the attribute n of
the condition C2. Then the increment is propagated
(INCR) to the relationship vector of the list element
5/(0,0,1), whose condition value "5" is in the value
range of condition C2. This results in the adjusted
list element 5/(1,1,2). Now the condition value list
represents condition C2 in addition to condition C1.

Finally condition C3 is processed. Both condition
values "25" and "70" are inserted in the condition
value list in sorted order. After initialization (INIT)
of the respective relationship vectors, the list
elements 25/(2,2,2) and 70/(1,1,1) are included so that
the whole condition value list still represents
condition C2 in addition to condition C1. The
relationship vectors of the newly inserted condition
values are then adjusted to reflect condition C3. The
condition C3 "between 25 and 70" is a combination of
the two conditions ">= 25" and "<= 70". Therefore, the
corresponding relationship vector components are
incremented accordingly, resulting in list elements
25/(2,3,3) and 70/(2,2,1). Because the condition value
"30" is within the value range of condition C3, the
increment is propagated to its corresponding
relationship vector, resulting in the list element
30/(3,3,2). Now the whole condition value list
represents all three conditions C1, C2, and C3 ("> 5",
"<= 30", and "between 25 and 70" with reference to
attribute n).

The first computing device 901 may now identify
the AND-subset 201 that includes condition values in
the condition value list where each of the subset
condition values has at least one relationship vector
component that has a value equal to the increment
multiplied by the number of conditions in the
plurality. In the example, the increment is "1" and the
number of conditions is "3". That is, any list element
having a relationship vector that includes at least one
vector component with a value of 3 is part of the
intersection of all three conditions. As the list
elements 25/(2,3,3) and 30/(3,3,2) fulfill this
requirement, they form the AND-subset 201.

Based on the AND-subset 201, the reduced Boolean
expression 320 can be composed because the three
conditions of the example can be reduced to a single
condition: 25 <= n <= 30 (n between 25 and 30).

FIG. 5 illustrates a second example of applying the
method 400 referring to the numerical attribute n.

In this second example, the computer system 990
has received 410 a Boolean expression 310 including the
following conditions referring again to the attribute
n:

C1': n = 100
C2': n not between 70 and 90
C3': n > 60
C4': n < 10
C5': n <> 95

All five conditions are combined with the Boolean
OR operator, for example, by selecting the OR radio
button in the GUI of FIG. 2 and entering the conditions
for the attribute accordingly.
The list element for the condition value "100" of
condition C1' is adjusted after insertion and
initialization to reflect the "=" relational operator
by incrementing the "=" counter in the relationship
vector. This results in the list element 100/(0,1,0).

The two condition values "70" and "90" are
extracted from condition C2'. The relational operator
"not between" corresponds to a combination of "<" and
">" (here "< 70" or "> 90"). Therefore,
the "<" counter of the condition value "70" and the
">" counter of the condition value "90" are incremented
accordingly, resulting in the list elements 70/(1,0,0)
and 90/(0,0,1). To consistently reflect the conditions
C1' and C2' in the condition value list, the third list
element having the condition value "100" (which is in
the value range of condition C2') is adjusted by
incrementing (INCR) the relationship vector components.
The result is 100/(1,2,1).

Condition C3' includes all previously inserted
condition values in its value range. Therefore, after
having initialized (INIT) the list element for the
condition value "60" to 60/(1,1,1) and after having
adjusted the list element to 60/(1,2,2), the increment
is propagated (INCR) to all further list elements. This
results in the adjusted list elements 70/(2,1,1),
90/(1,1,2), and 100/(2,3,2).






04-JUN-2003 10:37 SflP RG URLLDORF +49 6227 764433 S. 24/46 

2003P00256EP 17 

The value range of condition C4' does not affect
any of the previously inserted list elements.
Therefore, only the list element including the
condition value "10" needs to be adjusted to reflect
the "<" operator by incrementing the "<" counter after
initialization (INIT). The result is the list element
10/(2,1,1).

Condition C5' excludes the condition value "95"
from the result records of the Boolean expression 310.
However, all previously inserted list elements fall
within the value range of condition C5'. Therefore,
after initialization (INIT) to 95/(2,2,2) the
corresponding "<" counter and the ">" counter are
incremented to 95/(3,2,3) and the increment is
propagated (INCR) to the relationship vectors of all
other list elements in the condition value list. The
final result of the condition value list including all
five conditions is: 10/(3,2,2), 60/(2,3,3), 70/(3,2,2),
90/(2,2,3), 95/(3,2,3), and 100/(3,4,3).

As the Boolean expression combines the five
conditions with OR, at least one of the five conditions
must match, that is, the relationship vector components
must at least have the value of an increment (1 in the
example). Since all vector components of all list
elements in FIG. 5 have a value of at least "1", at
least all the condition values between "10" and "100"
lie in the matching value range. However, in the list
element for the condition value "100", the ">" counter
is also greater than 1. Therefore, all values greater
than "100" also match the five conditions. Similarly,
in the list element for the condition value "10", the
"<" counter is greater than 1, so all values less than
10 also match the five conditions. Therefore, an OR-
subset including condition values having at least one
vector component that has the initial value "0" cannot
be identified.
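The check for an OR-subset in this sense reduces to searching the relationship vectors for a counter that still holds the initial value. The following minimal sketch assumes list elements given as (condition_value, (less, equal, greater)) tuples; the function name or_has_selectivity is hypothetical.

```python
def or_has_selectivity(elements, initial=0):
    """Return True if any counter still has the initial value.

    A counter at the initial value marks a region matched by none of the
    OR-combined conditions, so the expression excludes at least some values.
    """
    return any(c == initial for _, vec in elements for c in vec)


# Final condition value list of the second example (cf. FIG. 5).
fig5 = [(10, (3, 2, 2)), (60, (2, 3, 3)), (70, (3, 2, 2)),
        (90, (2, 2, 3)), (95, (3, 2, 3)), (100, (3, 4, 3))]
print(or_has_selectivity(fig5))               # prints False
print(or_has_selectivity([(5, (0, 0, 1))]))   # single condition "n > 5": True
```

A result of False means that every possible value matches, so the query can be suppressed and a notification issued instead.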

The conclusion is that all possible values match
the five conditions C1' to C5' when combined with OR,
and thus the selectivity is zero. A corresponding
reduced Boolean expression is not meaningful in this
case, since all records available in the second
computing device 902 match the five conditions.
Therefore, it is not worthwhile sending a corresponding
query. For example, the first computing device 901 can
suppress the sending of the query. In addition, the
first computing device 901 may send a corresponding
notification to the user by, for example, displaying an
information pop-up for the user.
If the five conditions were combined with AND, the
corresponding AND-subset would be empty, since none of
the relationship vector components of any of the list
elements has a counter value of "5" (number of
conditions times the increment). So independently of
the given data it is impossible from a logical point of
view to fulfill all five conditions at the same time.
Therefore, it makes no sense to send such a query
statement, which might also include further conditions,
to the second computing device. The first computing
device can simply inform the user through the GUI
that in this case the result set is empty, for example,
with a pop-up or a corresponding sound message. The
second computing device is freed from useless workload
and can use the corresponding computing capacity
otherwise.

FIG. 6 illustrates a third example of applying the 
method 400, in this case referring to an alphanumeric 
attribute. 



The method 400 and the corresponding algorithms as
discussed in FIGs. 4 and 5 may also be applied to
alphanumeric data. The example of FIG. 6 uses family
names (LAST NAME, cf. FIG. 2) as an example of an
alphanumeric attribute. Not all operators are covered
in this example. Theoretically, all relational
operators already used in FIGs. 4 and 5 may be used.
However, in practice it is rather unusual for value
ranges (like LAST NAME < "Smith" or LAST NAME between
"Miller" and "Smith") to be selected on alphanumeric
attributes.

In contrast to searches using numerical
attributes, wild cards such as "*" may be used with
alphanumeric attributes. This third example illustrates
the usage of a "*" wild card at the end of a string.
Those skilled in the art can apply the findings to
other wild card patterns or similar patterns.

Assume that a Boolean expression is received that
includes the following conditions referring to the
attribute a standing for LAST NAME:

C1'': a = Mi*

C2'': a = Miller

C3'': a <> Smith

The Boolean expression combines all three
conditions with AND.

The third example shows the successive application
of the three conditions and the resulting condition
value list with the respective relationship vector
components.

Since the wild card "*" indicates a value range, a
condition like C1'' is represented by two list
elements. In this example, the condition "Mi*" may be
represented as "> Mi" and "< MiZZZZZZZZZZZZZZ". The
upper bound of the value range may be represented as
written here using the capital letter "Z", which is the
letter of the alphabet with the highest ASCII number.
The number of times "Z" is repeated equals the number
of characters remaining in the string. For example, the
value "Mi" has a length of 2 characters. If a meta data
declaration for the attribute (e.g., LAST NAME) has a
length of 60 characters, then 60 - 2 = 58 characters
remain to be filled up with "Z".
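
The padding rule can be sketched as follows. The helper names
padded_upper_bound and matches_prefix_range are hypothetical.
Two assumptions are made here that the text does not state:
values are compared in upper case (in plain ASCII, lower-case
letters sort after "Z", so a mixed-case comparison would not
behave as described), and both range bounds are treated as
inclusive.

```python
def padded_upper_bound(prefix, declared_length, fill="Z"):
    """Fill the prefix up to the declared attribute length with the fill letter."""
    if len(prefix) > declared_length:
        raise ValueError("prefix is longer than the declared attribute length")
    return prefix + fill * (declared_length - len(prefix))


def matches_prefix_range(value, prefix, declared_length):
    """Check a value against the range that stands in for 'prefix*'.

    Assumption: comparison is done on upper-cased strings, bounds inclusive.
    """
    lower = prefix.upper()
    upper = padded_upper_bound(lower, declared_length)
    return lower <= value.upper() <= upper


bound = padded_upper_bound("Mi", 60)
print(len(bound))                                 # declared length of the attribute
print(matches_prefix_range("Miller", "Mi", 60))
print(matches_prefix_range("Smith", "Mi", 60))
```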

An advantageous alternative to represent the upper
bound for the condition is to write "Mi*" for the upper
bound. Here the upper bound is not defined explicitly.
The wild card "*" indicates only that this element is
used as the upper bound of a range, where it is
implicitly clear that there is no combination of
letters starting with "Mi" that lies beyond the upper
bound.

In an embodiment of the invention, the proposed
algorithm works without an explicit upper bound. A
placeholder for an upper bound is sufficient. This is
convenient, as it means that no string operation has to
be performed to fill up the string with characters such
as "Z", and no meta data information needs to be
evaluated to calculate the number of characters to add.

The condition value list of the third example is
created in a similar way as explained in the first and
second examples. After having processed all three
conditions C1'' to C3'', the condition value list
includes the list elements Mi/(1,2,2), Miller/(2,3,2),
Mi*/(2,2,1), and Smith/(1,0,1).
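
The construction of this list can be retraced with a minimal
sketch. It is an illustration only, not the implementation of
FIG. 4: the class ConditionValueList, its method names and the
sort key that places the placeholder "Mi*" after every string
starting with "Mi" are assumptions made for this example. For
brevity the adjusting step scans the whole list instead of
propagating from the inserted element to its neighbors.

```python
from bisect import bisect_left

INITIAL, INCREMENT = 0, 1
TOP = "\U0010ffff"  # highest code point; sorts after any ordinary character


def sort_key(value):
    """Sort a placeholder such as "Mi*" after every string starting with "Mi"."""
    return value[:-1] + TOP if value.endswith("*") else value


class ConditionValueList:
    def __init__(self):
        self.keys, self.values, self.vectors = [], [], []

    def insert(self, value):
        """Insert in sorted order and initialize the (<, =, >) vector."""
        key = sort_key(value)
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return key  # value already listed; keep its vector
        if i > 0:
            start = self.vectors[i - 1][2]  # GREATER THAN of next smaller value
        elif self.keys:
            start = self.vectors[0][0]      # LESS THAN of next greater value
        else:
            start = INITIAL                 # first element of the list
        self.keys.insert(i, key)
        self.values.insert(i, value)
        self.vectors.insert(i, [start, start, start])
        return key

    def adjust(self, lo, hi, lo_incl=True, hi_incl=True):
        """Count one condition with value range lo..hi (None means unbounded)."""
        for key, vec in zip(self.keys, self.vectors):
            inside = (lo is None or key > lo) and (hi is None or key < hi)
            if inside:                      # strictly inside the range
                for c in range(3):
                    vec[c] += INCREMENT
            elif key == lo and key == hi:   # identity condition
                vec[1] += INCREMENT
            elif key == lo:                 # lower boundary of the range
                vec[2] += INCREMENT
                vec[1] += INCREMENT if lo_incl else 0
            elif key == hi:                 # upper boundary of the range
                vec[0] += INCREMENT
                vec[1] += INCREMENT if hi_incl else 0

    def add_equals(self, value):
        key = self.insert(value)
        self.adjust(key, key)

    def add_prefix(self, prefix):
        lo, hi = self.insert(prefix), self.insert(prefix + "*")
        self.adjust(lo, hi)

    def add_not_equals(self, value):
        key = self.insert(value)
        self.adjust(None, key, hi_incl=False)  # everything below the value
        self.adjust(key, None, lo_incl=False)  # everything above the value

    def as_dict(self):
        return {v: tuple(vec) for v, vec in zip(self.values, self.vectors)}


cvl = ConditionValueList()
cvl.add_prefix("Mi")         # C1'': a = Mi*
cvl.add_equals("Miller")     # C2'': a = Miller
cvl.add_not_equals("Smith")  # C3'': a <> Smith
print(cvl.as_dict())
```

Under these assumptions the printed list agrees with the list
elements given above for the third example.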

The first computing device 901 may now identify
the AND-subset 201 that includes the condition value of
the list element Miller/(2,3,2) because it is the only
condition value having at least one relationship vector
component that has a value equal to the increment (in
this case, 1) multiplied by the number of conditions
(in this case, 3).

Based on the AND-subset 201, the reduced Boolean
expression 320 can be composed because the three
conditions of the example can be reduced to a single
condition: a = "Miller". The other conditions are
redundant.

If the three conditions were combined in the
Boolean expression 310 by the Boolean OR operator, the
first computing device may identify an OR-subset 202 of
condition values in the condition value list having at
least one subset condition value that has a
relationship vector with at least one relationship
vector component being different from the initial value
"0". In the example, this is true for the condition
value of the list element Smith/(1,0,1). Since the
"=" counter for "Smith" is still "0" but all other
vector components have values >= 1, the equivalent
condition for the reduced Boolean statement 320, which
can replace the three original conditions C1'' to C3'',
is: a <> Smith. The other conditions are redundant.
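
Both subsets of the third example can be read off the final
condition value list with a short sketch. The function names
and the returned dictionaries are hypothetical. The AND test
follows the rule stated above (a counter equal to the increment
multiplied by the number of conditions); the OR test uses the
wording of the claims, i.e. it selects condition values that
still have at least one counter at the initial value.

```python
COMPONENTS = ("<", "=", ">")


def and_subset(vectors, n_conditions, increment=1):
    """Condition values with a counter equal to n_conditions * increment."""
    target = n_conditions * increment
    hits = {}
    for value, vec in vectors.items():
        names = tuple(n for n, c in zip(COMPONENTS, vec) if c == target)
        if names:
            hits[value] = names
    return hits


def or_subset(vectors, initial=0):
    """Condition values that still have a counter at the initial value."""
    hits = {}
    for value, vec in vectors.items():
        names = tuple(n for n, c in zip(COMPONENTS, vec) if c == initial)
        if names:
            hits[value] = names
    return hits


# Final condition value list of the third example.
third_example = {
    "Mi": (1, 2, 2),
    "Miller": (2, 3, 2),
    "Mi*": (2, 2, 1),
    "Smith": (1, 0, 1),
}
print(and_subset(third_example, 3))  # which value satisfies all three conditions
print(or_subset(third_example))      # which value is matched by no condition
```

The component names in the result indicate how the reduced
condition is phrased: a full "=" counter for Miller corresponds
to a = Miller, and an untouched "=" counter for Smith
corresponds to a <> Smith.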

Sometimes it is not possible to reduce the
plurality of conditions to just one condition, but two
or more conditions remain, each representing an
interval which is disjoint from the other intervals.
Assume these disjoint intervals refer to an attribute
(numerical or alphanumeric) with a cardinality small
enough that the entire list of attribute values is
stored locally in the first computing device. For
example, an attribute COUNTRY may have a cardinality of
100, and the list of 100 different values for COUNTRY
may be stored locally in the first computing device.
Now assume that a set of N conditions is reduced to two
disjoint intervals (two conditions). If no values occur
between the inner interval boundaries, so that these
boundaries are in fact neighboring points, then the two
disjoint intervals may be united into one result
interval, with the result that the two conditions are
further reduced to just one condition. This principle
may be generalized for the unification of more than two
intervals. It may occur that only some of the intervals
can be merged into one interval whereas other intervals
cannot be merged because there are values between the
interval boundaries that do not match the conditions.
Even in such cases, a reduction of the original number
of conditions is obtained, resulting in less workload
for the second computing device.
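
The unification of neighboring intervals can be sketched as
follows, assuming closed intervals and a locally stored, sorted
list of all attribute values. The function name
merge_neighboring_intervals and the sample country codes are
hypothetical and serve only as an illustration.

```python
from bisect import bisect_left, bisect_right


def merge_neighboring_intervals(intervals, domain):
    """Unite closed intervals whose inner boundaries are neighboring values.

    intervals: disjoint (low, high) pairs; domain: sorted list of all
    values of the attribute as stored in the first computing device.
    """
    merged = []
    for low, high in sorted(intervals):
        if merged:
            prev_low, prev_high = merged[-1]
            # Number of attribute values strictly between the inner boundaries.
            gap = bisect_left(domain, low) - bisect_right(domain, prev_high)
            if gap <= 0:
                merged[-1] = (prev_low, max(prev_high, high))
                continue
        merged.append((low, high))
    return merged


countries = ["AT", "BE", "CH", "DE", "FR", "IT"]  # made-up attribute values
remaining = [("AT", "BE"), ("CH", "DE"), ("IT", "IT")]
print(merge_neighboring_intervals(remaining, countries))
```

In this sample the first two intervals are united because no
value lies between "BE" and "CH", while the last interval
stays separate because "FR" lies between "DE" and "IT".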

The invention can be implemented in digital
electronic circuitry, or in computer hardware,
firmware, software, or in combinations of them. The
invention can be implemented as a computer program
product, i.e., a computer program tangibly embodied in
an information carrier, e.g., in a machine-readable
storage device or in a propagated signal, for execution
by, or to control the operation of, data processing
apparatus, e.g., a programmable processor, a computer,
or multiple computers. A computer program can be
written in any form of programming language, including
compiled or interpreted languages, and it can be
deployed in any form, including as a stand-alone
program or as a module, component, subroutine, or other
unit suitable for use in a computing environment. A
computer program can be deployed to be executed on one
computer or on multiple computers at one site or
distributed across multiple sites and interconnected by
a communication network.

Method steps of the invention can be performed by
one or more programmable processors executing a
computer program to perform functions of the invention
by operating on input data and generating output.
Method steps can also be performed by, and apparatus of
the invention can be implemented as, special purpose
logic circuitry, e.g., an FPGA (field programmable gate
array) or an ASIC (application-specific integrated
circuit).

Processors suitable for the execution of a
computer program include, by way of example, both
general and special purpose microprocessors, and any
one or more processors of any kind of digital computer.
Generally, a processor will receive instructions and
data from a read-only memory or a random access memory
or both. The essential elements of a computer are at
least one processor for executing instructions and one
or more memory devices for storing instructions and
data. Generally, a computer will also include, or be
operatively coupled to receive data from or transfer
data to, or both, one or more mass storage devices for
storing data, e.g., magnetic, magneto-optical disks, or
optical disks. Information carriers suitable for
embodying computer program instructions and data
include all forms of non-volatile memory, including by
way of example semiconductor memory devices, e.g.,
EPROM, EEPROM, and flash memory devices; magnetic
disks, e.g., internal hard disks or removable disks;
magneto-optical disks; and CD-ROM and DVD-ROM disks.
The processor and the memory can be supplemented by, or
incorporated in, special purpose logic circuitry.

To provide for interaction with a user, the
invention can be implemented on a computer having a
display device, e.g., a cathode ray tube (CRT) or
liquid crystal display (LCD) monitor, for displaying
information to the user and a keyboard and a pointing
device, e.g., a mouse or a trackball, by which the user
can provide input to the computer. Other kinds of
devices can be used to provide for interaction with a
user as well; for example, feedback provided to the
user can be any form of sensory feedback, e.g., visual
feedback, auditory feedback, or tactile feedback; and
input from the user can be received in any form,
including acoustic, speech, or tactile input.
The invention can be implemented in a computing
system that includes a back-end component, e.g., as a
data server, or that includes a middleware component,
e.g., an application server, or that includes a
front-end component, e.g., a client computer having a
graphical user interface or a Web browser through which
a user can interact with an implementation of the
invention, or any combination of such back-end,
middleware, or front-end components. The components of
the system can be interconnected by any form or medium
of digital data communication, e.g., a communication
network. Examples of communication networks include a
local area network (LAN) and a wide area network (WAN),
e.g., the Internet.

The computing system can include clients and
servers. A client and server are generally remote from
each other and typically interact through a
communication network. The relationship of client and
server arises by virtue of computer programs running on
the respective computers and having a client-server
relationship to each other.

Claims

1. A computer implemented method for logically
   evaluating a Boolean expression used in a query
   statement, wherein the Boolean expression refers to
   an attribute and includes a plurality of conditions
   (C1, C2, C3), comprising the steps:
   receiving (410) the Boolean expression (310);
   decomposing (420) the Boolean expression (310) into
      the plurality of conditions (C1, C2, C3);
   for each condition of the plurality:
      extracting (421) from the condition at least
         one condition value referring to the
         attribute, wherein the at least one
         condition value defines a value range of
         the condition;
      inserting (422) the at least one condition
         value in a condition value list in sorted
         order;
      initializing (423) a relationship vector for
         the at least one condition value; and
      adjusting (424) the relationship vectors for
         the at least one condition value and for
         each further condition value that is in the
         condition list and that is in the value
         range of the condition.

2. The method of claim 1, comprising the further step:
   reducing (430) the Boolean expression according to
      each relationship vector.

3. The method of claims 1 or 2, wherein the extracting
   step (421) retrieves a maximum condition value
   and/or a minimum condition value of the condition.

4. The method of any one of claims 1 to 3, wherein the
   extracting step (421) retrieves an identity-condition
   value of the condition.

5. The method of any one of the claims 1 to 4, wherein
   the relationship vector comprises a LESS THAN
   component, an EQUAL TO component and a GREATER THAN
   component.

6. The method of claim 5, where the initializing step
   (423) is performed by:
   setting each relationship vector component for the
      at least one condition value to an initial
      value if the condition list has no further
      condition value;
   setting each relationship vector component to the
      LESS THAN component value of the relationship
      vector for the next greater condition value in
      the condition value list; or
   setting each relationship vector component to the
      GREATER THAN component value of the
      relationship vector for the next smaller
      condition value in the condition value list.

7. The method of claim 5, where the adjusting step
   (424) is performed by:
   incrementing at least one relationship vector
      component for the at least one condition value
      by an increment to reflect the condition; and
   propagating the increment through each relationship
      vector component for each further condition
      value in the condition list as long as the
      further condition value is within the value
      range of the condition.



8. The method of claim 7, wherein the reducing step
   (430) comprises:
   identifying an AND-subset (201) of condition values
      in the condition value list, where each subset
      condition value has at least one relationship
      vector component that has a value equal to the
      increment multiplied by the number of
      conditions in the plurality.

9. The method of claim 8, wherein the reducing step
   (430) further comprises:
   composing a reduced Boolean expression (320) based
      on the AND-subset.

10. The method of claim 7, wherein the reducing step
    (430) comprises:
    identifying an OR-subset (202) of condition values
       in the condition value list, where each subset
       condition value has at least one relationship
       vector component with the initial value.

11. The method of claim 10, wherein the reducing step
    (430) further comprises:
    composing a reduced Boolean expression (320) based
       on the OR-subset.

12. The method of claim 8, further comprising:
    if the AND-subset (201) is empty, sending a
       corresponding notification to a user.



13. The method of claim 10, further comprising:
    if the OR-subset (202) is empty, sending a
       corresponding notification to a user.

14. The method of claims 9 or 11, where the reduced
    Boolean expression (320) comprises a condition that
    merges at least a first condition and a second
    condition, the first and second conditions
    referring to the attribute and representing
    disjoint intervals, the attribute having no values
    between the inner interval boundaries of the
    disjoint intervals.

15. A computer program product for logically evaluating
    a Boolean expression used in a query statement,
    stored on a data carrier, or carried by a signal
    and comprising a plurality of instructions that
    when loaded into a memory of a computing device
    (901) cause at least one processor of the computing
    device (901) to execute the steps of any of the
    claims 1 to 14.

16. A computer system (990) for logically evaluating a
    Boolean expression used in a query statement,
    wherein the Boolean expression (310) refers to an
    attribute and includes a plurality of conditions
    (C1, C2, C3), comprising:
    a computing device (901) having a memory to receive
       (410) the Boolean expression and to store a
       condition value list; and
    having at least one processor for executing
       computer program instructions to:
    decompose (420) the Boolean expression (310) into
       the plurality of conditions (C1, C2, C3);
    for each condition of the plurality:
       extract (421) from the condition at least one
          condition value referring to the attribute,
          wherein the at least one condition value
          defines a value range of the condition;
       insert (422) the at least one condition value
          in the condition value list in sorted
          order;
       initialize (423) a relationship vector for the
          at least one condition value; and
       adjust (424) the relationship vectors for the
          at least one condition value and for each
          further condition value that is in the
          condition list and that is in the value
          range of the condition.



17. The computer system (990) of claim 16, wherein the
    at least one processor further executes computer
    program instructions to
    reduce (430) the Boolean expression (310) according
       to each relationship vector.

18. The computer system of any one of the claims 16 to
    17, wherein the relationship vector comprises a
    LESS THAN component, an EQUAL TO component, and a
    GREATER THAN component.

19. The computer system of claim 18, wherein the
    computer program instructions causing the at least
    one processor to initialize (423) have:
    a first portion to set each relationship vector
       component for the at least one condition value
       to an initial value if the condition list has
       no further condition value; and
    a second portion to set each relationship vector
       component to the LESS THAN component value of
       the relationship vector for the next greater
       condition value in the condition value list; or
       to set each relationship vector component to
       the GREATER THAN component value of the
       relationship vector for the next smaller
       condition value in the condition value list.


20. The computer system of claim 19, wherein the
    computer program instructions causing the at least
    one processor to adjust (424) have:
    a first portion to increment at least one
       relationship vector component for the at least
       one condition value by an increment to reflect
       the condition; and
    a second portion to propagate the increment through
       each relationship vector component for each
       further condition value in the condition list
       as long as the further condition value is
       within the value range of the condition.

21. The computer system of claim 20, wherein the memory
    stores
    an AND-subset (201) of condition values in the
       condition value list, where each subset
       condition value has at least one relationship
       vector component that has a value equal to the
       increment multiplied by the number of
       conditions in the plurality.
22. The computer system of claim 21, wherein the at
    least one processor executes further computer
    program instructions to compose a reduced Boolean
    expression (320) based on the AND-subset (201).

23. The computer system of claim 20, wherein the memory
    stores
    an OR-subset (202) of condition values in the
       condition value list, where each subset
       condition value has at least one relationship
       vector component with the initial value.

24. The computer system of claim 20, wherein the memory
    stores
    an OR-subset of condition values in the condition
       value list, where each subset condition value
       has at least one relationship vector component
       with a value greater than the initial value.

25. The computer system of claims 23 or 24, wherein the
    at least one processor executes further computer
    program instructions to compose a reduced Boolean
    expression (320) based on the OR-subset (202).

26. The computer system of claim 21, where the at least
    one processor executes further computer program
    instructions to send a corresponding notification
    to a user, if the AND-subset (201) is empty.

27. The computer system of claim 23, where the at least
    one processor executes further computer program
    instructions to send a corresponding notification
    to a user, if the OR-subset (202) is empty.

28. The computer system of claims 22 or 25, where
    the memory stores a list of all values of the
       attribute; and
    the at least one processor executes further
       computer program instructions to merge at least
       a first condition and a second condition, the
       first and second conditions referring to the
       attribute and representing disjoint intervals,
       the attribute having no values between the
       inner interval boundaries of disjoint
       intervals.

29. A graphical user interface implementation
    configurable to provide a graphical user interface
    (GUI) to a user, whereby the graphical user
    interface (GUI) is suitable for receiving from the
    user a Boolean expression (310) for logical
    evaluation by performing the steps of any of the
    claims 1 to 14.



METHOD AND COMPUTER SYSTEM FOR QUERY OPTIMIZATION
Abstract of the Invention

Method and computer system for logically evaluating a
Boolean expression used in a query statement to
optimize the query. The Boolean expression refers to an
attribute and includes a plurality of conditions. The
Boolean expression is received (410) and decomposed
(420) into the plurality of conditions. For each
condition at least one condition value referring to the
attribute is extracted (421) from the condition. The at
least one condition value is then inserted (422) into a
condition value list in sorted order. A relationship
vector of the at least one condition value is
initialized (423). Then the relationship vectors of the
at least one condition value and of each further
condition value that is in the condition list and that
is in the value range of the condition are adjusted
(424). After having processed all conditions, the
Boolean expression may be reduced (430) according to
each relationship vector.

FIG. 3



[Drawing sheets, largely illegible in this copy.
Recoverable labels: sheet 1/6: BOOLEAN EXPRESSION,
REDUCED BOOLEAN EXPRESSION; sheet 3/6: EXTRACTI...
CONDITION ..., INITIALIZING RELATIONSHIP VECTOR,
ADJUSTING RELATIONSHIP VECTORS.]



T/EP :004/U50988 







